ABSTRACT

We investigate the intrinsic stellar populations (estimated total numbers of
OB and pre-main-sequence stars down to 0.1 M⊙) that are present in 17 massive
star-forming regions (MSFRs) surveyed by the MYStIX project. The study is
based on the catalog of >31,000 MYStIX Probable Complex Members with both
disk-bearing and disk-free populations, compensating for extinction, nebulosity,
and crowding effects. Correction for observational sensitivities is made using the
X-ray Luminosity Function (XLF) and the near-infrared Initial Mass Function
(IMF), a correction that is often not made by infrared surveys of young stars.

The resulting maps of the projected structure of the young stellar populations, in
units of intrinsic stellar surface density, allow direct comparison between different
regions. Several regions have multiple dense clumps, similar in size and density to
the Orion Nebula Cluster. The highest projected density of ~34,000 stars pc^-2 is
found in the core of the RCW 38 cluster. Histograms of surface density show
different ranges of values in different regions, supporting the conclusion of Bressert
et al. (2010, B10) that no universal surface-density threshold can distinguish
between clustered and distributed star formation. However, a large component
of the young stellar population of MSFRs resides in dense environments of
200-10,000 stars pc^-2 (including within the nearby Orion molecular clouds), and we
find no evidence for the B10 conclusion that such dense regions
form an extreme “tail” of the distribution. Tables of intrinsic populations for
these regions are used in our companion study of young cluster properties and
evolution.


1 Department of Astronomy & Astrophysics, 525 Davey Laboratory, Pennsylvania State University,
University Park, PA 16802, USA

2 Instituto de Fisica y Astronomia, Universidad de Valparaiso, Gran Bretaña 1111, Playa Ancha,
Valparaiso, Chile


2 Millennium Institute of Astrophysics 





1. Introduction


The Milky Way Galaxy is a critical place in the universe for studying star formation.
Only here can the populations of low-mass stars, which make up the vast majority of stars,
be resolved and the full spatial structure of young stellar clustering and molecular clouds
be analyzed, revealing detailed information about how star formation progresses within a
region. Most stars, including the Sun (Gounelle & Meynet 2012; Dukes & Krumholz 2012),
are born in clusters with OB-type stars, so it is important to study the massive star-forming
regions (MSFRs) in the solar neighborhood. The young stellar clusters in these regions can
be precursors to open clusters, but most of their stars become gravitationally unbound due
to gas expulsion, so an understanding of the star-formation histories and early cluster dynamics
in these regions provides clues about how bound clusters and field stars are produced (e.g.,
Goodwin & Bastian 2006; Pfalzner 2011; Kruijssen et al. 2012; Banerjee & Kroupa 2014).


Historically, studies of massive Galactic star-forming regions have been hindered by
difficulties inherent to Galactic Plane astronomy; in particular, field stars greatly outnumber
members of the star-forming region in optical or infrared (IR) images (e.g., King et al. 2013;
Kuhn et al. 2013b). X-ray surveys readily detect star-forming region members due to the
high X-ray luminosities from strong magnetic activity of pre-main-sequence stars (L_X ~
10^-3 L_bol; Preibisch et al. 2005a), and these surveys are not strongly affected by nebulosity,
obscuration, or crowding. Excess IR emission from disk-bearing young stars has proven to be
another useful method of establishing membership, but IR-only surveys will miss the large
populations of members without dusty protoplanetary disks, and such studies rarely account
for observational sensitivities in determining intrinsic stellar populations, as we attempt to
do in this paper. The combination of X-ray selected stars and IR-excess selected stars can
provide better samples of stars in MSFRs than either method alone (Feigelson et al. 2013;
Townsley et al. 2011). Thus, the analysis of empirical distributions of inferred stellar mass
and X-ray luminosities from the combined samples can be used to estimate total populations
(e.g., Getman et al. 2012, and references therein).


The Massive Young Star-Forming Complex Study in Infrared and X-ray (MYStIX;
Feigelson et al. 2013) examines 20 nearby massive star-forming regions (MSFRs) using a
combination of archival Chandra X-ray imaging, 2MASS+UKIDSS near-IR (NIR), and Spitzer
mid-IR (MIR) survey data. The catalogs of young stars include both high-mass and
low-mass stars, and disk-bearing and disk-free stars (Broos et al. 2013). We use this sample of
stars to study the intrinsic populations of young stars across 17 of the MYStIX MSFRs.

The present study is closely based on the construction of the MYStIX Probable Complex
Members catalog (MPCM; Broos et al. 2013) and the statistical segregation of MPCMs
into ~140 subclusters by Kuhn et al. (2014, Paper I). Our principal objective here is to
overcome sensitivity limitations of the MPCM catalog for each MYStIX MSFR in order to
normalize the observed stellar distributions to intrinsic stellar distributions. We obtain two
quantities of interest: the total intrinsic stellar population and the stellar surface densities
in each MYStIX subcluster. The total populations are important inputs into a multivariate
analysis of young cluster properties in our forthcoming study (Kuhn et al. 2014b, Paper III).
The stellar surface densities address long-standing issues about typical environments in which
stars form.


1.1. Thresholds for Clustered Star Formation 


The statistical relationships that traditionally underlay our understanding of
star-formation processes have been scale-free relationships like the Salpeter (1955) stellar initial
mass function (IMF) and the Kennicutt-Schmidt law relating global galactic star formation
rate to interstellar material (Schmidt 1959; Kennicutt & Evans 2012). Nevertheless,
important preferred scales for star formation were later found. The IMF, a power law at high
stellar masses, peaks around 0.2-0.3 M⊙ and declines for lower mass stars (Chabrier 2003).
And, in the Galactic neighborhood, a threshold for star formation was found at A_V ≈ 7
magnitudes of dust absorption (n[H2] ~ 3 × 10^4 cm^-3) associated with Galactic disk
stability (Johnson et al. 2004; Lada 2010; Martin & Kennicutt 2001; Schaye 2004; Leroy et al.
2008), although this rule is not applicable within the inner 0.5 kpc of the Galaxy where
star formation is comparatively suppressed (Longmore et al. 2013).


The surface density of stellar populations in star-forming regions, in units of stars per
square parsec, has been a property of interest for the field of star-cluster formation (e.g.,
Carpenter 2000; Lada & Lada 2003; Allen et al. 2007; Jørgensen et al. 2008; Gutermuth et al.
2009). The surface densities of young stars can also have astrophysical implications, such
as tidal truncation of protoplanetary disks (Pfalzner et al. 2005), binary star distributions
(Bate 2009a; Moeckel & Clarke 2011), or the survival of clusters after molecular cloud
dispersal (Kruijssen 2012). Bressert et al. (2010, henceforth B10) have recently examined the
shape of the distribution of stellar surface densities in star-forming regions. They use
samples of disk/envelope-bearing stars identified through IR excess in regions within 0.5 kpc of
the Sun, including the Gould Belt (Allen et al. 2006), the Orion A and B molecular clouds
(Megeath et al. 2012), the Taurus molecular cloud (Rebull et al. 2010), and the regions from
the cores-to-disks (c2d; Evans et al. 2003) project; thus their sample is dominated by
low-mass young-stellar-object (YSO) environments. B10 argue that if “clustered” star formation
and “distributed” star formation were two distinct star-formation processes, then “clustered”
and “distributed” populations should appear as distinct modes in the surface-density
distribution (cf. Gieles et al. 2012; Pfalzner et al. 2012), and that such a scenario could be
tested by searching for a “scale” of star formation separating these two modes. Such a kink
in the surface-density distribution has been used in some investigations of the structure of
young stellar clusters, for example by Gutermuth et al. (2009), to separate distinct clusters
of stars in star-forming regions. Given that B10 find a smooth distribution of surface
density from their data, which they report is adequately fit by a log-normal distribution with a
peak at 22 stars pc^-2, they conclude that no such “scale” exists, at least for low-mass YSO
environments present in the solar neighborhood.


Nevertheless, the log-normal distribution from B10 is not scale free; instead, a peak
at 22 stars pc^-2 and a width of 0.85 dex suggest a density distribution weighted towards
low-density environments. If this result is a general characteristic of most star formation in
the Galaxy, rather than just the nearby regions investigated by B10, it would have
implications for theories of star formation, as numerous researchers have discussed. For example,
Parmentier & Pfalzner (2013) find that their models of local-density-driven star formation
from a single molecular clump could produce the 22 stars pc^-2 scale from B10. King et al.
(2012) suggest that surface densities significantly greater than 22 stars pc^-2 could indicate
that a cluster has undergone a “cool collapse phase.” Kruijssen et al. (2012) present a model
in which the low fraction of star formation that results in bound clusters is, in part, a result
of a density spectrum weighted towards low surface densities. Parker et al. (2011) note that
dynamical processing of primordial binaries by clusters depends on whether most stars form
in low-density regions as suggested by B10, or whether higher density clusters are more
common. And de Juan Ovelar et al. (2012) investigate the threshold densities in star-forming
regions where stellar interactions affect habitable planet formation; the fraction of the
stars born in environments above their 3 × 10^3 stars pc^-2 threshold would depend on whether
high-density regions are just a tail of the B10 log-normal or a different mode not seen in
the B10 sample. B10 support this interpretation of the empirical results from their sample,
stating, “only a small fraction (<26 per cent) of stars form in dense clusters where their
formation and/or evolution is expected to be influenced by their surroundings.”


It is important to investigate whether these results continue to hold for more massive
star-forming regions (those regions containing O-type stars). Nearly 70% of B10’s sample
comes from the Orion Giant Molecular Clouds, which do contain O-type stars, but they note
that the IR-excess methods used are ineffective at identifying young stars in the presence
of nebulosity and crowding in this complex. Their other star-forming regions are lower
mass. Given that studies of the mass function for star-forming regions favor the birth of
stars in more massive complexes, investigation of these complexes in a way that is more
effective at probing the densest regions could be helpful for determining the validity of B10’s
suggestion that clustered stars (Σ > 200 stars pc^-2) exist in the tail of the surface-density
distribution, rather than being a dominant component. Nevertheless, a definitive study of
surface-density distributions for star formation would require the construction of an unbiased
survey of all star-forming environments. MYStIX neither includes the most massive
star-forming environments in the Galaxy (such as W 43, Wd 1, NGC 3603, or the Arches Cluster)
nor includes large angular area studies of diffuse molecular clouds necessary to capture the
lowest surface density environments.


1.2. MYStIX


The MYStIX survey (Feigelson et al. 2013) differs from many previous studies in that
it focuses on relatively massive star-forming regions lying in nearby Galactic spiral arms,
and supplements samples of IR-excess young stars with X-ray selected young stars and
spectroscopically identified OB stars.

MYStIX is a survey of 20 of the nearest (d < 3.6 kpc) MSFRs that have been
observed with NASA’s Chandra X-ray Observatory, the Spitzer Space Telescope, and the
United Kingdom Infra-Red Telescope (UKIRT) or the Two Micron All Sky Survey (2MASS;
Skrutskie et al. 2006). The MYStIX regions include the Orion Nebula, the Flame Nebula,
W 40, RCW 36, NGC 2264, the Rosette Nebula, the Lagoon Nebula, NGC 2362, DR 21,
RCW 38, NGC 6334, NGC 6357, the Eagle Nebula, M 17, W 3, W 4, the Carina Nebula,
the Trifid Nebula, NGC 3576, and NGC 1893, from which a sample of 31,784 MYStIX
Probable Complex Members (MPCMs; Broos et al. 2013) is obtained. The MPCM catalog thus
consists of young stars that are X-ray selected, IR-excess selected, or OB stars from the
literature. MYStIX provides the cleanest and largest lists of young stars for most of the 20 regions
included in the study, so these catalogs should be scientifically useful for different purposes.
One of the requirements of the MYStIX project was to use sensitive and homogeneous data
analysis procedures for all 20 regions to facilitate inter-comparisons between regions.
Special procedures had to be developed to deal with the challenges of working in the Galactic Plane,
as described in the MYStIX technical-catalog papers: Kuhn et al. (2013a), Townsley et al.
(2014), King et al. (2013), Kuhn et al. (2013b), Naylor et al. (2013), Povich et al. (2013),
and Broos et al. (2013).


The spatial distributions of MPCMs in 17 of the MYStIX MSFRs are investigated by
Kuhn et al. (2014, henceforth Paper I) and in this work. Three regions are omitted,
NGC 3576, W 3, and W 4, because they lack JHK UKIRT photometry and have a low
match rate between X-ray sources and sources from the 2MASS catalog. We use a subset
of the MPCM sources (~17,000 stars) produced after X-ray selected MPCMs are pruned to
a uniform X-ray sensitivity within each region (Paper I). This eliminates artificial surface
density gradients associated with differing X-ray exposure times in Chandra mosaics and the
sensitivity variations within each Chandra field due to telescope coma and vignetting, the
“egg-crate effect” (Townsley et al. 2011). To prune a region, we remove sources with X-ray
photon fluxes (log_PhotonFlux_t; Broos et al. 2013) that are lower than the completeness
limits provided in Table 1 of Paper I.
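
The pruning step described above amounts to a threshold cut on each source's photon flux. The following is a minimal sketch in Python, assuming the fluxes are available as an array of log photon-flux values. The function name and the sample fluxes are hypothetical, and the limit of -5.9 is borrowed from the Carina value quoted later in this paper purely for illustration.

```python
import numpy as np

def prune_to_uniform_sensitivity(log_photon_flux, completeness_limit):
    """Return a boolean mask that keeps sources at or above the flux limit."""
    flux = np.asarray(log_photon_flux, dtype=float)
    # Sources fainter than the limit are removed; NaN fluxes are also dropped,
    # because any comparison with NaN evaluates to False.
    return flux >= completeness_limit

# Illustrative values only (not taken from the MPCM catalog).
log_flux = np.array([-6.4, -5.7, -5.9, -6.1, -5.2])
mask = prune_to_uniform_sensitivity(log_flux, completeness_limit=-5.9)
pruned = log_flux[mask]
```

Applying one limit per region, rather than per pointing, is what gives the uniform sensitivity that the surface-density analysis relies on.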

Nevertheless, the resulting observed surface densities, used by the analysis in Paper I, do
not contain the entire intrinsic population, differ in sensitivity from MSFR to MSFR, and are
affected by spatially variable N_H absorption and mid-IR sensitivity. As pre-main-sequence
(PMS) X-ray luminosities strongly scale with stellar mass (Telleschi et al. 2007), inconsistent
X-ray sensitivities due to differing Chandra exposure times and MSFR distances lead to
different samplings of the cluster IMFs. We overcome these selection effects by calibrating the
observed X-ray luminosity function (XLF) and IMF distributions to the Orion Nebula Cluster
(ONC), which serves as a template for young cluster populations, rather than attempting to
model instrumental and observational effects.


The organization of this paper (Paper II) is as follows. We analyze the IMF and X-ray
luminosity function (XLF) to infer intrinsic populations from observed young stellar
populations (Section 2). We derive intrinsic stellar surface density maps from these populations
(Section 3), and investigate surface density distributions (Sections 4 and 5). The star-forming
regions in the MYStIX sample are typically richer than those in the sample studied by B10.
The MYStIX MSFRs exhibit a large diversity in their surface density distributions (ranging
from ~10 to ~30,000 stars pc^-2), neither showing a tendency to follow a universal surface
density distribution nor showing a convincing peak at some characteristic surface density.
These results are discussed in Sections 4 and 5 and summarized in the conclusion (Section 6).


2. Stellar Populations 


The completeness limits and detection fractions¹ of the MPCM samples vary from
region to region, due to differences in distance, obscuration, and X-ray and IR observation
exposures. Extrapolations of the total numbers of stars in a region, which we infer
empirically from observed MPCM samples based on comparisons with the ONC, are necessary for
comparison of intrinsic properties of stellar populations in different star-forming regions.


¹ The completeness limit of a sample is defined to be the minimum luminosity (or mass) such that nearly
100% of objects with greater luminosity (or mass) are included in the sample. The detection fraction is the
number of observed sources N_obs divided by the intrinsic number of sources N_tot.
2.1. X-ray Luminosity Functions


Young stars in the Chandra Orion Ultradeep Project (COUP; Getman et al. 2005b) had
large numbers of X-ray counts, allowing X-ray luminosities (L_t,c; total 0.5-8.0 keV band and
absorption corrected) to be obtained by parametric modeling of the X-ray spectrum using the
XSPEC code (Arnaud 1996). As the MYStIX stars are mostly too faint for this procedure,
X-ray luminosities for other MPCMs were computed using non-parametric calibrations for
PMS stars (XPHOT; Getman et al. 2010) by Kuhn et al. (2013a), Townsley et al. (2014),
Broos et al. (2011), and Kuhn et al. (2010).


The probability distribution of L_t,c, called the XLF, is associated with the IMF due
to the statistical link between X-ray luminosity and stellar mass M (e.g., Feigelson et al.
1993; Preibisch et al. 2005a; Telleschi et al. 2007). The assumption of a “universal XLF”
(Feigelson & Getman 2005b) has been used to estimate total populations in young stellar
clusters, including Cep B (Getman et al. 2006), M 17 (Broos et al. 2007), NGC 6357
(Wang et al. 2007), Rosette (Wang et al. 2008, 2009, 2010), W 40 (Kuhn et al. 2010),
Trumpler 15 (Wang et al. 2011), Trumpler 16 (Wolk et al. 2011), Sh 2-254/255/256/257/258
(Mucciarelli et al. 2011), NGC 1893 (Caramazza et al. 2012), and IC 1396 (Getman et al.
2012). During PMS stellar evolution, there is a weak relation between X-ray luminosity and
age (e.g., Preibisch & Feigelson 2005; Pandey et al. 2014); however, L_t,c does not rapidly
decline during the first 5 Myr, unlike the rapid decrease in bolometric luminosity L_bol during
PMS evolution along the Hayashi track. Instead, L_t,c ~ M appears to be the fundamental
relationship rather than L_t,c ~ L_bol (Getman et al. 2014b). Thus, X-ray luminosity evolution
appears to have little effect on the shape of the PMS XLF (e.g., Bhatt et al. 2013).


Following these previous studies, we use a sample of stars from COUP to approximate
the probability distribution of the universal XLF. The COUP study contains a sample of
839 lightly absorbed stars (Feigelson et al. 2005), which are identified as the members of the
ONC, while the more highly absorbed stars are identified as being embedded in the Orion
Molecular Cloud (OMC) behind the ONC. These lightly obscured COUP stars are complete
down to a mass of 0.1-0.2 M⊙ (with partial coverage into the proto-brown-dwarf regime;
Preibisch et al. 2005b) and show an XLF shape characterized by a falling distribution at
high luminosities with a break to an approximately flat distribution at luminosities below
L_t,c ~ 10^30.4 erg s^-1. Henceforth, we label this distribution the COUP XLF. The tail with
X-ray luminosities greater than this turnover can be fit with a power-law (Pareto) distribution
of slope α, with a minimum variance unbiased estimator α* and variance Var(α*) given by
the equations
the equations 


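$$
\alpha^* = -\left(1 - \frac{2}{n}\right) \frac{n}{\sum_{i=1}^{n} \left(\ln x_i - \ln x_m\right)}
\quad \text{and} \quad
\mathrm{Var}(\alpha^*) = \frac{\alpha^{*2}}{n - 3},
\qquad (1)
$$
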
where x_m is the X-ray luminosity of the turnover point and x_i is the X-ray luminosity of
the ith source in a sample of n sources in the distribution tail (Johnson et al. 1994). Thus,
α* = −0.9 ± 0.1 with n = 61, while the distribution is roughly flat in logarithmic bins below
the turnover point. Mucciarelli et al. (2011) have also found similar L_t,c Pareto distributions
for the ONC and the Sh 2-254-258 regions.
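
The estimator of Equation (1) can be written out as a short calculation. The sketch below is a hedged illustration, not the authors' code: it assumes the estimator takes the form α* = −(1 − 2/n) n / Σ(ln x_i − ln x_m) with Var(α*) = α*²/(n − 3), and the function name and toy luminosities are hypothetical.

```python
import numpy as np

def pareto_tail_slope(x, x_m):
    """Slope estimate and variance for the power-law tail above x_m.

    Assumes the form alpha* = -(1 - 2/n) * n / sum(ln x_i - ln x_m)
    and Var(alpha*) = alpha*^2 / (n - 3).
    """
    x = np.asarray(x, dtype=float)
    tail = x[x >= x_m]              # only sources above the turnover
    n = tail.size
    if n <= 3:
        raise ValueError("need more than 3 sources in the tail")
    log_sum = np.sum(np.log(tail) - np.log(x_m))
    if log_sum <= 0:
        raise ValueError("tail sources must exceed the turnover luminosity")
    alpha_star = -(1.0 - 2.0 / n) * n / log_sum
    variance = alpha_star ** 2 / (n - 3)
    return alpha_star, variance, n

# Toy tail: five sources, each a factor of e above the turnover (x_m = 1).
toy = np.exp(np.ones(5))
alpha_star, variance, n_tail = pareto_tail_slope(toy, x_m=1.0)
```

As a consistency check on the quoted result, inserting α* = −0.9 and n = 61 into the variance expression gives a standard deviation of sqrt(0.81/58) ≈ 0.12, in line with the stated uncertainty of ±0.1.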


The L_t,c distributions for the MYStIX regions can be compared to the empirical
distribution function (EDF) of 839 lightly absorbed COUP sources to further investigate the
universality of the XLF shape. For example, Wang et al. (2008, their Figure 9b) show that
the XLF cumulative distributions in the Rosette star-forming region agree with the COUP
XLF above the X-ray luminosity completeness limit for the Rosette sample. For this analysis
it is necessary to be cautious about how completeness limits are treated because
differential absorption can change the apparent shape of the XLF if the sample of highly absorbed
sources is incomplete. For example, a sample of sources that are more deeply embedded will
have a higher mean luminosity than a sample of sources from the same observation that are
unabsorbed, which could lead to a flattening of the power-law of the combined distribution.


In Figure 1 we show the COUP EDF running from unity at very low luminosities
(L_t,c ~ 10 2 * ' erg s^-1) to zero for the most luminous PMS star in the Orion Nebula field. Note
that spectroscopically identified OB stars have been removed from both the COUP XLF and
from MYStIX samples considered here because MYStIX regions can differ widely in their
massive stellar subpopulations. The L_t,c EDFs for the other regions are shown below, with
arbitrary vertical spacings. Only the portion of the XLF where the sample is complete is
shown. (The completeness limit for the full sample is set to the completeness limit for the
most heavily obscured subpopulation in the region.) In general there is excellent agreement
between the shapes of the different lines. Some curvature can be seen in the COUP XLF
between 10^30.5 erg s^-1 and 10^32.5 erg s^-1, which is also reflected in the shapes of the XLFs
for other regions. The nearest regions tend to probe a lower-luminosity section of the XLF,
while the more distant regions tend to probe a higher-luminosity section of the XLF. Due
to the XLF curvature, the XLF shape appears less steep for nearer regions and steeper for
more distant regions. Table 1 (Column 2) gives the power-law indices² (for the full sample)
calculated over the regions shown in Figure 1. This confirms the trend in which more distant
regions have steeper slopes, not because of differences in intrinsic XLF shape, but due to
differences in the available portion of the XLF.
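
The selection effect described here, where a curved luminosity function yields a steeper fitted slope when only its bright end is sampled, can be illustrated with synthetic data. The sketch below is only a toy demonstration: it assumes a log-normal luminosity distribution as a stand-in for a curved XLF (it is not the COUP XLF), uses the simple maximum-likelihood form of the Pareto slope, and all names and parameter values are hypothetical.

```python
import numpy as np

def tail_slope(log_lx, log_cut):
    """Pareto-type slope d log N / d log L for sources above log_cut (MLE form)."""
    log_lx = np.asarray(log_lx, dtype=float)
    excess_dex = log_lx[log_lx >= log_cut] - log_cut
    # The MLE index is 1 / mean(ln x - ln x_m); convert dex to natural-log units.
    return -1.0 / (np.log(10.0) * excess_dex.mean())

# Synthetic curved luminosity function: log-normal in L (assumed, not COUP).
rng = np.random.default_rng(0)
log_lx = rng.normal(loc=29.5, scale=1.0, size=20000)

slope_near = tail_slope(log_lx, log_cut=30.0)  # deeper, nearby-like sample
slope_far = tail_slope(log_lx, log_cut=31.0)   # shallower, distant-like sample
```

Both samples are drawn from the same parent distribution, yet the higher cutoff returns a more negative slope, which is the behavior attributed above to the differing completeness limits of near and distant regions.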


² The power-law fits are often poor and we do not recommend that these values be used for astrophysical
interpretation.
Fig. 1. The EDF of X-ray luminosities, L_t,c, for the 839 lightly obscured, low-mass stars
from COUP (complete down to 0.1-0.2 M⊙) is shown by the thick gray line. The black lines,
from top to bottom, are the X-ray luminosity EDFs for the other MYStIX regions. Vertical
shifts of 0.3 units are used to separate the different lines for visual clarity. Lines end at
the completeness limit for the full sample (i.e., the completeness limit for the most heavily
obscured subpopulation in the region).
Table 1. XLF properties of MSFRs

Region      α*      N_tot     N_ME<1.5   N_1.5<ME<2.5   N_ME>2.5
                    (stars)   (stars)    (stars)        (stars)
(1)         (2)     (3)       (4)        (5)            (6)

Orion       -0.9     2600       700       1200            720
Flame       -1.0      800        55        250            500
W 40        -1.1      520         0        250            270
RCW 36      -0.8      550         0        230            320
NGC 2264    -1.0     1900      1100        230            570
Rosette     -1.3     2500       840       1000            610
Lagoon      -1.1     3800      2000        870            930
NGC 2362    -1.1      600       570         30              0
DR 21       -1.0     2900        50        550           2253
RCW 38      -1.3     9900        70       1900           8000
NGC 6334    -1.2     9400       180       2900           6200
NGC 6357    -1.2    12000       440       7400           3700
Eagle       -1.3     8100      1300       4300           2500
M 17        -1.3    16000       170       5400          11000
Carina      -1.5    34000      9500      17000           7000
Trifid      -1.0     3100      1500       1100            470
NGC 1893    -1.5     4600      1700       2400            590

Note. Column 1: Name of MYStIX MSFR. Column 2: Power-law
index (d log N/d log L_t,c) for the portion of the XLF tail shown
in Figure 2 (uncertainties are ±0.1). Column 3: Inferred intrinsic
total population. Columns 4-6: Inferred intrinsic numbers of stars
with ME < 1.5 keV, 1.5 < ME < 2.5 keV, and ME > 2.5 keV,
respectively. (Due to rounding, the sum of Columns 4, 5, and 6 is
not always equal to Column 3.)






2.1.1. Intrinsic Numbers of Stars 


If we accept the assumption of a “universal XLF” with the COUP sample serving as a 
template, the total stellar population of a star-forming region may be extrapolated from the 
missing stars at low luminosities where the region’s observed XLF has dropped to zero due 
to incompleteness, but where the existence of these stars may be inferred from the universal 
XLF shape. To perform this calculation, the COUP XLF histogram is scaled so that it 
matches the observed XLF in the section of the XLF where the observed sample is complete. 
The completeness limit for the full sample of young stars will be the completeness limit for 
the most absorbed subpopulation of stars in the region. However, few of the most deeply 
embedded protostars may be observed, so this population will be poorly characterized, and 
the inferred total populations may be considered lower limits that do not necessarily account 
for all undetected, embedded stars. 
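
The scaling step can be sketched as a simple count ratio. This is a simplified stand-in for the histogram matching described above, under the assumptions that the template sample is itself complete and that both samples are compared only above the region's completeness limit. The function name and the toy luminosities are hypothetical.

```python
import numpy as np

def extrapolate_population(obs_log_lx, template_log_lx, log_lx_limit):
    """Estimate an intrinsic population by scaling a template XLF.

    The template is scaled so that its counts above the completeness limit
    match the observed counts; the scaled template total is returned.
    """
    obs = np.asarray(obs_log_lx, dtype=float)
    tmpl = np.asarray(template_log_lx, dtype=float)
    n_obs = np.count_nonzero(obs >= log_lx_limit)
    n_tmpl = np.count_nonzero(tmpl >= log_lx_limit)
    if n_tmpl == 0:
        raise ValueError("template has no sources above the completeness limit")
    scale = n_obs / n_tmpl
    return scale * tmpl.size

# Toy inputs (log L in erg/s), not real COUP or MYStIX values.
template = np.array([28.0, 29.0, 30.0, 31.0, 32.0])
observed = np.array([31.0, 31.5, 32.0, 32.2])
n_intrinsic = extrapolate_population(observed, template, log_lx_limit=31.0)
```

Because the estimate rests only on sources brighter than the limit, a higher completeness limit leaves fewer stars to anchor the scaling, which is why the more distant regions are expected to have less precise totals.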


If we further assume that X-ray absorptions from the molecular cloud are independent
of the intrinsic stellar X-ray luminosities, we can estimate the intrinsic numbers of stars
with different amounts of absorption in the star-forming region. This is useful because it
allows us to compare total populations in more extinguished parts of a star-forming region
to total populations in less extinguished areas. Using this assumption, any subset of stars
selected by X-ray absorption will be drawn independently from the “universal XLF” and
will therefore have the same XLF shape, allowing us to perform the XLF population analysis
described above on the subset. This assumption may not be entirely true; for example,
mass segregation may cause more X-ray luminous stars to lie preferentially in the center of
a cluster where absorptions are often higher. There is also weak evidence for a factor of ~2
systematic increase in X-ray luminosity from the younger Class I to the older Class II/III
systems (Prisinzano et al. 2008). Nevertheless, the absorption-stratified MYStIX XLFs all
show consistency with the COUP XLF, indicating that these are not major effects.

Interstellar medium absorption of MPCMs may be evaluated using J − H color indices
or X-ray Median Energy (ME; Getman et al. 2010) indicators, both of which increase as
absorbing columns increase (Getman et al. 2014b). The spread in absorptions for subclusters
of stars is provided in Paper I (their Figure 8), which shows the median J − H and ME
for clusters of young stars in the MYStIX regions. There is a clump of data points with
ME ≲ 1.5 keV (the unembedded population), while the absorbed population ranges over
1.5 < ME < 5.0 keV. The bulk of the absorbed groups of stars are moderately absorbed, with
1.5 < ME < 2.5 keV, and tend to have physical properties similar to the unabsorbed clusters.
In contrast, the most absorbed groups of stars are much more compact (r_core ≈ 0.08 pc) and
tend to be centered on dense molecular filaments or clumps (Getman et al. 2014b).


We choose to stratify the MYStIX MPCM stars into three absorption strata using ME
divisions at 1.5 and 2.5 keV. This captures different aspects of the spatial structure of the
stars: roughly, the population of stars outside the molecular cloud, the population within the
molecular cloud, and the population associated with dense molecular filaments and cores.
Figure 2 shows the observed XLF histograms for these three subpopulations in NGC 6357
(black lines): the ME < 1.5 keV sources (left), the 1.5 < ME < 2.5 keV sources (center),
and the ME > 2.5 keV sources (right). The completeness limits in this region vary from
10^30.3, to 10^30.6, to 10^31.0 erg s^-1, for these three strata, respectively. The template COUP XLF
(gray lines) is scaled to these three populations, using a vertical shift that minimizes the
area between the two EDFs in the range where the XLF is approximately complete. The
XLFs of all three strata show agreement between the shape of the sample XLF and the
COUP XLF. The intrinsic number of stars in each absorption stratum can be estimated
by integrating the number of stars under the scaled universal XLF curve, i.e., the gray
COUP line. Thus, the inferred intrinsic population in NGC 6357 is ~440 lightly obscured
stars, ~7400 moderately obscured stars, and ~3700 highly obscured stars.³ Similar plots of
stratified XLFs are provided for the other 16 regions as a figure set in the electronic version
of the article.
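
One way to realize the vertical shift between two EDFs is sketched below. It rests on assumptions that are not stated in the text: the area is measured as the absolute difference of log cumulative counts on a uniform grid in log luminosity, in which case the minimizing shift is the median of the pointwise differences. The function names and toy data are hypothetical, and the authors' actual procedure may differ.

```python
import numpy as np

def cumulative_counts(log_lx, grid):
    """Number of sources with log L at or above each grid value."""
    ordered = np.sort(np.asarray(log_lx, dtype=float))
    return ordered.size - np.searchsorted(ordered, grid, side="left")

def scale_template_edf(obs_log_lx, template_log_lx, log_lx_limit, n_grid=50):
    """Return the vertical shift (dex) and the scaled template population."""
    obs = np.asarray(obs_log_lx, dtype=float)
    tmpl = np.asarray(template_log_lx, dtype=float)
    upper = min(obs.max(), tmpl.max())
    if log_lx_limit > upper:
        raise ValueError("completeness limit lies above the sampled range")
    grid = np.linspace(log_lx_limit, upper, n_grid)
    n_obs = cumulative_counts(obs, grid)
    n_tmpl = cumulative_counts(tmpl, grid)
    # The median minimizes the summed absolute difference between the curves.
    shift = np.median(np.log10(n_obs) - np.log10(n_tmpl))
    return shift, (10.0 ** shift) * tmpl.size

# Toy check: an "observed" sample that is exactly three copies of the template.
template = np.linspace(29.0, 32.0, 31)
observed = np.tile(template, 3)
shift, n_total = scale_template_edf(observed, template, log_lx_limit=30.0)
```

Running the same function once per absorption stratum, each with its own completeness limit, and summing the three results would mirror the stratified totals quoted above for NGC 6357.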

As expected, for every region, the more highly absorbed strata have a higher L_t,c
completeness limit than the less absorbed strata. Generally, there is good agreement with the shape
of the COUP XLF; however, in most cases the sample becomes incomplete before reaching
the turnover point at L_t,c ~ 10^30.4 erg s^-1 in the XLF. The Flame Nebula is one of the few
cases where the completeness limit is less luminous than the XLF turnover point in the lightly
and moderately absorbed strata, so the “flat” portions of the XLFs can be compared; the
figure indicates that the XLFs are consistent. There is also an indication of this turnover in the
lightly absorbed stratum for NGC 2264.

The extrapolated intrinsic stellar populations for each stratum in each region are pro¬ 
vided in Table [1] These values are combined to produce estimates of the total intrinsic 
populations for each star-forming region (Column 3). These values of lV t ot are the princi¬ 
pal empirical results of this study. Effects of distance and observational sensitivity should 
be approximately corrected by this analysis, but inferred total populations for more dis¬ 
tant, less complete regions will be less precise and may be affected by additional system- 
atics. To investigate these effects, we simulate the X-ray sensitivity for the Orion Nebula 


3 We repeated this analysis using alternate ME divisions to investigate whether these choices have any
systematic effect on the total populations of stars that we infer. Using 1.0, 1.25, 1.5, 1.75, and 2.0 keV for
the first division and 2.0, 2.25, 2.5, 2.75, and 3.0 keV for the second division, we find that inferred total
numbers of stars (Table 1, column 3) only change by ~3%, and there is no systematic trend towards over- or
underestimation.
region if it were at the distance of the Carina Nebula (~2.3 kpc) with the pruned X-ray
sensitivity (log PhotonFlux$_t$ > −5.9) for the Chandra Carina Complex Project (CCCP;
Townsley et al. 2011). Getman et al. (2005a, their Table 3) report 21 close doubles with
<2" separations in COUP, which would be indistinguishable at the distance of Carina. For
this test, we combine their X-ray luminosities into a single source for the purpose of X-ray
sensitivity limits and XLF analysis. From the 120 X-ray sources in this reduced-sensitivity
Orion sample (out of 1216 original X-ray sources), we infer a total of ~2000 stars rather than
~2600 stars. Thus our $N_{\rm tot}$ values may underestimate the true populations by up to ~30%,
with most of the missing stars from the highly obscured ME stratum. Effects of distance
and observational sensitivity are discussed further in Sections 3 and 4.
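A rough sketch of the distance-degradation step is given below. It assumes that the apparent photon flux simply follows the inverse-square law (ignoring the merging of close doubles and any change in absorption or instrument response), and it takes the Orion distance of 0.414 kpc quoted in Section 4.1. The function name and input fluxes are hypothetical.

```python
import math


def degrade_to_distance(log_photon_flux, d_near_kpc, d_far_kpc, log_flux_limit):
    """Shift log photon fluxes to a larger distance and apply a flux limit.

    Inverse-square dimming corresponds to a shift of 2*log10(d_near/d_far)
    in log flux. Returns the shifted fluxes that remain above the limit.
    """
    shift = 2.0 * math.log10(d_near_kpc / d_far_kpc)
    return [f + shift for f in log_photon_flux if f + shift > log_flux_limit]


# Three hypothetical Orion sources moved to the distance of Carina.
kept = degrade_to_distance([-3.0, -4.0, -4.5], 0.414, 2.3, log_flux_limit=-5.9)

# Size of the effect quoted in the text: ~2000 inferred vs. ~2600 stars.
shortfall = 1.0 - 2000.0 / 2600.0  # inferred total is ~23% below the reference
excess = 2600.0 / 2000.0 - 1.0     # reference is 30% above the inferred total
```

The two ratios express the same discrepancy from either side, which is why the text describes the underestimate as "up to ~30%".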
Fig. 2.— The XLF for NGC 6357 X-ray selected MPCMs with (left) ME < 1.5 keV, (center)
1.5 < ME < 2.5 keV, and (right) ME > 2.5 keV. Stars with known spectral types earlier
than B3 are not included. The COUP XLF of 839 lightly absorbed stars is scaled to these
populations. Completeness limits for the three absorption strata are indicated by the vertical
dashed lines. (The complete figure set (17 images) is available in the online journal.)
2.1.2. Detection Fraction for X-ray Selected MPCMs


The MPCM X-ray detection fraction, $f = N_{\rm obs}/N_{\rm tot}$, can be used as a correction factor
to convert observed surface densities of young stars into intrinsic surface densities. However,
dividing the surface density of a region by a single detection fraction does not account for
spatial variation in sensitivity. The statistical sample of X-ray selected young stars provided
in Paper I is a pruned subset of the MPCMs to which a uniform photon-flux limit is applied to
correct for observational sensitivity effects, including Chandra telescope vignetting, variation
in point-spread function, and differing net exposure times in a mosaicked field. However,
these samples do not control for differing luminosity completeness limits due to variable
extinction.


An improved estimate of stellar surface densities can be made using the absorption-stratified
samples from Section 2.1.1. Each ME stratum has a narrower range of absorptions
than the full sample, so the spatial variation in sensitivity within these samples will be lower
than for the full sample. The sources in each stratum are adaptively smoothed (described
below) to produce surface density maps, and the detection fraction for each stratum is given
by the XLF analysis.


The spatially dependent detection fractions for the full Paper I X-ray selected samples 
are computed using the equation, 


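$$
f_{\rm full}(\alpha, \delta) = \frac{f_1 \Sigma_1(\alpha, \delta) + f_2 \Sigma_2(\alpha, \delta) + f_3 \Sigma_3(\alpha, \delta)}{\Sigma_1(\alpha, \delta) + \Sigma_2(\alpha, \delta) + \Sigma_3(\alpha, \delta)} \qquad (2)
$$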


where $f_i$ is the detection fraction of the $i$-th stratum and $\Sigma_i(\alpha, \delta)$ is the adaptively smoothed
observed surface density of that stratum. The detection fraction tends to be roughly constant
across the fields of view, with dips found where molecular clouds and cores are located. However,
the detection fraction varies strongly between regions, so it is essential that scientific comparison
between regions be based on the intrinsic young stellar surface densities rather than observed
MPCM surface densities (see footnote 4).
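Equation 2 can be evaluated at a single map position as in the sketch below. The function name and the numerical inputs are hypothetical; in practice the same expression would be applied pixel by pixel to the three smoothed maps.

```python
def full_detection_fraction(f_strata, sigma_strata):
    """Evaluate Equation 2 at one position.

    f_strata     : detection fraction of each ME stratum (from the XLF analysis)
    sigma_strata : observed, smoothed surface density of each stratum there
    Returns the observed-density-weighted mean detection fraction.
    """
    numerator = sum(f * s for f, s in zip(f_strata, sigma_strata))
    denominator = sum(sigma_strata)
    if denominator == 0:
        return float("nan")  # no observed stars: the fraction is undefined
    return numerator / denominator


# Hypothetical position dominated by the moderately absorbed stratum.
f_here = full_detection_fraction((0.5, 0.2, 0.1), (10.0, 20.0, 10.0))
```

A position dominated by a heavily absorbed stratum is pulled toward that stratum's low detection fraction, which is how molecular clouds produce dips in the map.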


Figure 3 shows detection fraction maps for the Carina, Eagle, and NGC 6334 fields
of view. The larger distance and shorter Chandra exposure times for the Carina Nebula 
compared to the Eagle Nebula result in a lower overall detection fraction for Carina compared 
to Eagle. In the Eagle Nebula, the bubble around the main NGC 6611 cluster results in a 
relatively high detection fraction, while the embedded subclusters to the north-east have 


4 The alternative ME strata (the two divisions ranging from 1.0-2.0 and 2.0-3.0 keV) produce maps with 
the same overall morphology as those reported here, without much change in completeness fraction over 
most of the field of view, but with 10-15% difference at the extreme values of the map. 
lower detection fractions. A molecular filament passes through NGC 6334 from north-east
to south-west, producing a notable trough in the detection fraction map of this region.
Overall, detection fractions for MYStIX MSFRs range from 1% to 60%.

Table 2 lists the intrinsic stellar populations inferred from the X-ray MPCM populations
for the 142 subclusters in Paper I, counting stars out to 4 times the subcluster core radius (see footnote 5).
The detection fractions for these subclusters were obtained by interpolating the detection 
fraction at the location of each star assigned to a subcluster and taking the mean value, 
weighted to correct for incompleteness of the sample. The intrinsic population of a subcluster 
is the number of observed stars in that subcluster from Paper I, multiplied by the fraction 
of those stars that are X-ray selected, and divided by the detection fraction of the X-ray 
selected sample. 
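The arithmetic described in this paragraph amounts to the following one-line correction. The function and argument names are hypothetical, and the example numbers are invented for illustration.

```python
def subcluster_intrinsic_population(n_observed, xray_fraction, detection_fraction):
    """Intrinsic population of a subcluster.

    n_observed         : observed stars in the subcluster (Paper I)
    xray_fraction      : fraction of those stars that are X-ray selected
    detection_fraction : mean detection fraction of the X-ray selected sample
    """
    if not 0.0 < detection_fraction <= 1.0:
        raise ValueError("detection fraction must lie in (0, 1]")
    return n_observed * xray_fraction / detection_fraction


# Hypothetical subcluster: 100 observed stars, 80% X-ray selected, f = 0.2.
n_intrinsic = subcluster_intrinsic_population(100, 0.8, 0.2)
```

Multiplying by the X-ray selected fraction first ensures that the detection fraction, which is defined for the X-ray sample, is applied only to that sample.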


5 The numbers of observed stars reported in Paper I count all the stars from the center of a subcluster 
out to 4 core radii. We use this cutoff because the projected half-mass radii of the subclusters are difficult 
to measure. If numbers for other radii are desired, they can be calculated from these data using Equation 3 
in Paper I to obtain a correction factor. We define $r_4 = 4.0 \times r_c$, where $r_c$ is the “isothermal ellipsoid” core
radius in Paper I.
Fig. 3.— The inferred fractional completeness of the MPCM catalogs in various regions.
The fraction of young-stellar members that we expect to include in our MPCM list at any 
point is indicated by the shading of the maps, with dark shades indicating low detection 
fractions and light shades indicating high detection fractions as shown on the colorbar. (The 
data behind this figure is provided with the electronic edition of this article.) 
2.2. Initial Mass Functions


The J vs. J − H diagram has been used to obtain mass estimates of young stars, and
thereby IMFs for young clusters, in a number of multiwavelength studies (e.g., Getman et al.
2008; Kuhn et al. 2010; Wright et al. 2010). For low-mass stars, individual inferred masses
typically have 30% systematic errors (Kuhn et al. 2010, their Appendix A); however, this
uncertainty is relatively small compared to the range of stellar masses in these samples
and unimportant when the scientific questions concern their collective distributions. The
method may be biased for the youngest stars: the spectral energy distribution modeling
by Povich et al. (2013) indicates that NIR absorption from a heavy disk or envelope may
substantially increase the J-band magnitude, causing a star to appear to have a lower mass
than it truly does. This method does not account for the mass of circumstellar material,
which is substantial for protostars. Furthermore, this method is insensitive to very
young protostars undetected in NIR bands.


For each MYStIX subcluster, we adopt the median age from Getman et al. (2014b), using
their Age$_{JX}$ estimates where these exist and otherwise their Age$_{JH}$ estimates, and use the
Siess et al. (1997) numerical pre-main-sequence evolutionary models to estimate absorption
and mass for each star. A completeness mass limit is estimated empirically from the inferred
stellar masses for each subcluster. The Maschberger (2013) IMF is then scaled to
the complete end of the observed mass functions, and the number of missing stars down to
0.1-0.2 $M_\odot$ is extrapolated.
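The extrapolation can be illustrated with the sketch below. It assumes the three-parameter functional form commonly associated with the Maschberger (2013) IMF, whose cumulative distribution is built from an auxiliary function G(m) = (1 + (m/μ)^(1−α))^(1−β). The parameter values (α = 2.3, β = 1.4, μ = 0.2 $M_\odot$) and the 150 $M_\odot$ upper limit are assumptions of this example and are not given in the text; the function names are hypothetical.

```python
def _g(m, alpha=2.3, beta=1.4, mu=0.2):
    """Auxiliary function of the assumed IMF form; it increases with mass m."""
    return (1.0 + (m / mu) ** (1.0 - alpha)) ** (1.0 - beta)


def extrapolate_population(n_complete, m_lim, m_min=0.1, m_max=150.0):
    """Scale the number of stars above m_lim to the number above m_min.

    The number of stars between a mass a and m_max is proportional to
    G(m_max) - G(a), so the lower truncation of the IMF cancels in the ratio.
    All masses are in solar units.
    """
    g_max = _g(m_max)
    return n_complete * (g_max - _g(m_min)) / (g_max - _g(m_lim))


# Hypothetical subcluster: 50 stars above a completeness limit of 1 solar mass.
n_down_to_01 = extrapolate_population(50, m_lim=1.0, m_min=0.1)
```

With these assumed parameters, roughly one star in nine above 0.1 $M_\odot$ is more massive than 1 $M_\odot$, so a modest change in the completeness limit changes the extrapolated population considerably.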


Table 2 lists the intrinsic numbers of stars calculated from IMF analysis for each of the
142 Paper I subclusters. While there are numerous potential sources of error in this analysis,
at a minimum there is a counting-statistics uncertainty of $N_{\rm tot}/\sqrt{N_{M>M_{\rm lim}}}$, where $N_{\rm tot}$ is
the inferred intrinsic population and $N_{M>M_{\rm lim}}$ is the number of stars used to scale the IMF.
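A short sketch of this uncertainty follows. The conversion to an uncertainty on log N uses first-order error propagation, which is an assumption of this example rather than a statement of how the tabulated errors were derived; the function name is hypothetical.

```python
import math


def counting_uncertainty(n_tot, n_scaled):
    """Counting-statistics uncertainty on an extrapolated population.

    n_tot    : inferred intrinsic population
    n_scaled : number of stars above the mass limit used to scale the IMF
    Returns (sigma_N, sigma_logN), with sigma_logN from first-order propagation.
    """
    sigma_n = n_tot / math.sqrt(n_scaled)
    sigma_log_n = 1.0 / (math.log(10.0) * math.sqrt(n_scaled))
    return sigma_n, sigma_log_n


# Hypothetical case: 1000 inferred stars scaled from 25 stars above the limit.
sigma_n, sigma_log_n = counting_uncertainty(1000.0, 25)
```

The logarithmic uncertainty depends only on the number of stars used for the scaling, not on the size of the extrapolation.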


Figure 4 plots the intrinsic population from IMF analysis vs. the intrinsic population
from XLF analysis on a log-log plot. The points fall along the y = x line with a root-mean-square
scatter of ~0.25 dex, although some points may deviate by up to a factor of ~3. IMF-inferred
populations for sparser subclusters are slightly systematically higher than the XLF-inferred
populations, but, overall, there is little systematic shift or tilt in the relation. Subclusters
that are more highly absorbed tend to have fewer members, but they do not show any more
or less deviation from the y = x line than the unabsorbed subclusters.
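As a small check of the quoted scatter, the sketch below computes the root-mean-square difference between log N_IMF and log N_XLF for the six subclusters in the Table 2 stub that list both estimates. This subset is only illustrative; the full 142-subcluster table is needed to reproduce the published value.

```python
import math

# (log N_XLF, log N_IMF) pairs from the Table 2 stub rows with both estimates.
stub_rows = {
    "Orion C": (3.21, 3.37),
    "Flame A": (2.74, 2.91),
    "W40 A": (2.48, 2.66),
    "RCW 36 A": (2.73, 2.27),
    "RCW 36 B": (1.66, 1.94),
    "NGC 2264 A": (1.21, 0.94),
}

# Differences in dex between the IMF-based and XLF-based estimates.
diffs = [imf - xlf for xlf, imf in stub_rows.values()]
mean_offset = sum(diffs) / len(diffs)
rms_scatter = math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

For these six rows the scatter comes out near 0.27 dex with a mean offset close to zero, in line with the ~0.25 dex quoted for the full sample.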


Povich et al. (2011) find that IR-derived population estimates, which include the IR-excess
selected stars that lack detected X-ray counterparts, produce results ~20% higher
than the estimates that use only the X-ray selected sources. However, this offset is small
relative to other sources of uncertainty in population estimation, and is not apparent on the
log-log plot in Figure 4. Another source of uncertainty in Figure 4 is the comparison of the
completeness limit in $L_{t,c}$ to a mass completeness limit. The sample of 839 COUP stars is
complete down to 0.1-0.2 $M_\odot$, so there is some uncertainty in determining which mass limit
to use for best comparison between XLF- and IMF-inferred populations.


Fig. 4.— Intrinsic numbers of stars within the MYStIX subclusters estimated via the IMF
(ordinate, log N [stars]) vs. the XLF (abscissa, log N [stars]). The y = x line where both
methods produce the same estimate is indicated.




Table 2. Intrinsic Population Estimates from the XLF and IMF

Columns 1-6 are properties from Paper I.

Subcluster    α             δ             r_4,major   r_4,minor   PA      log N_XLF    log N_IMF
              (J2000)       (J2000)       (arcmin)    (arcmin)    (deg)   (stars)      (stars)
(1)           (2)           (3)           (4)         (5)         (6)     (7)          (8)
--------------------------------------------------------------------------------------------------
Orion A       83.8110030    -5.3752777    0.42        0.32        85      1.66±0.13    ···
Orion B       83.8154178    -5.3897248    1.93        1.35        28      2.17±0.10    ···
Orion C       83.8195378    -5.3761802    10.17       5.19        5       3.21±0.02    3.37±0.09
Orion D       83.8242661    -5.2763330    7.60        1.20        12      1.94±0.12    ···
Flame A       85.4270870    -1.9037960    5.14        3.24        146     2.74±0.03    2.91±0.43
W40 A         277.8614542   -2.0940426    4.54        4.37        107     2.48±0.04    2.66±0.09
RCW 36 A      134.8623491   -43.7555688   3.47        2.32        122     2.73±0.04    2.27±0.06
RCW 36 B      134.8634966   -43.7571938   1.01        0.15        23      1.66±0.11    1.94±0.31
NGC 2264 A    100.1312445   9.8311532     1.35        1.16        136     1.21±0.15    0.94±0.12
NGC 2264 B    100.1545919   9.7918914     0.56        0.31        124     1.12±0.21    ···

Note. — Column 1: Subcluster name from Paper I. Columns 2-3: Celestial coordinates (J2000) 
for the subcluster center. Columns 4-5: Semi-major and semi-minor axes for an ellipse 4 times the 
size of the subcluster core defined in Paper I. Column 6: Position angle of the subcluster ellipse in 
degrees east from north. Column 7: Intrinsic number of stars projected within four subcluster core 
radii estimated from the XLF analysis. Column 8: Intrinsic number of stars projected within four 
subcluster core radii estimated from the IMF analysis. A value of “ • • • ” indicates that a subcluster 
has too few stars with good JH photometry to estimate a mass-completeness limit and/or the IMF 
scaling. Estimates of uncertainty include the $\sqrt{N}$ Poisson uncertainty, but do not include the multiple
sources of systematic error in the XLF and IMF analysis. (A full version of this table is available in 
the electronic edition of this article—a stub is provided here to show form and content.) 
3. Intrinsic Stellar Surface Density


Figure 5 shows intrinsic stellar surface density for the 17 MSFRs. Observed
surface densities for X-ray selected MPCMs are calculated following the Ogata et al. (2003,
2004) adaptive-smoothing method and then corrected to the intrinsic populations by dividing
the observed surface densities by the detection fraction maps. These maps can thus be
directly compared with each other; Figure 5 shows all regions with the same physical length
scale (in parsecs, based on the distances given by Feigelson et al. 2013) and the same surface-density
units (in stars per square parsec). A 5-pc length scale is drawn.

The adaptive smoothing method is based on the Voronoi tessellation as implemented by
the adaptive.density function in the spatstat CRAN package of the R statistical software
environment (Baddeley & Turner 2005). A randomly selected subset of stars is used to create
a Voronoi tessellation of the field, which will naturally tend to have smaller cells in regions
of higher stellar density. The other stars are used to estimate surface density in these cells.
This procedure can be repeated a large number of times using different subsets to create
the tessellation and estimate the surface density. To produce our maps, we used a sample
containing $N_{\rm obs}/5$ stars to create the tessellation and repeated the procedure 100 times,
averaging together the results. This method produces results that are similar to other adaptive
smoothing methods used by astronomers, such as the k-th nearest neighbor surface-density
estimator (Gutermuth et al. 2008) or adaptive kernel density estimator (KDE) methods (e.g.,
Abramson 1982). The choice of what fraction of stars to use to build the tessellation for the
Ogata method is analogous to the choice of k for the k-th nearest-neighbor method or the
kernel size for KDE.
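The procedure can be sketched in pure Python as below. This is a pixelised stand-in, not the spatstat implementation: it assumes a unit-square window, measures Voronoi cell areas by counting grid pixels whose centers fall in each cell, and uses hypothetical function names. The grid must be fine enough to resolve the smallest cells, otherwise peak densities are underestimated.

```python
import random


def _nearest(x, y, gens):
    """Index of the generator closest to (x, y), i.e. its Voronoi cell."""
    best, best_d2 = 0, float("inf")
    for k, (gx, gy) in enumerate(gens):
        d2 = (x - gx) ** 2 + (y - gy) ** 2
        if d2 < best_d2:
            best, best_d2 = k, d2
    return best


def voronoi_density(points, frac=0.2, nrep=20, ngrid=20, seed=0):
    """Adaptive density map on the unit square, as ngrid x ngrid rows of y.

    In each repetition a random fraction `frac` of the points generates the
    tessellation; the remaining points are counted per cell and divided by
    the cell area and by the fraction of points that were counted.
    """
    rng = random.Random(seed)
    n = len(points)
    n_gen = max(1, round(frac * n))
    if n_gen >= n:
        raise ValueError("need points left over to count in the cells")
    kept_frac = (n - n_gen) / n
    pix_area = 1.0 / ngrid ** 2
    centers = [(i + 0.5) / ngrid for i in range(ngrid)]
    total = [[0.0] * ngrid for _ in range(ngrid)]
    for _ in range(nrep):
        order = list(range(n))
        rng.shuffle(order)
        gens = [points[i] for i in order[:n_gen]]
        counts = [0] * n_gen
        for i in order[n_gen:]:
            counts[_nearest(points[i][0], points[i][1], gens)] += 1
        # Assign every pixel to a cell and measure cell areas in pixels.
        owner = [[_nearest(cx, cy, gens) for cx in centers] for cy in centers]
        npix = [0] * n_gen
        for row in owner:
            for k in row:
                npix[k] += 1
        for j in range(ngrid):
            for i in range(ngrid):
                k = owner[j][i]
                total[j][i] += counts[k] / (kept_frac * npix[k] * pix_area)
    return [[v / nrep for v in row] for row in total]
```

Setting `frac=0.2` corresponds to using one fifth of the stars to build the tessellation; a smaller fraction gives larger cells and a smoother map.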


For any non-parametric smoothing algorithm, there is a tradeoff between bias and variance,
with bigger “kernels” leading to smaller variance but larger bias. This effect can lead
to suppression of peaks in stellar surface density, which is demonstrated by Figure 6. When
one tenth of the points are used to create the Voronoi tessellation, leading to a smoother
map (left), and one fifth of the points are used to create the Voronoi tessellation, leading to
a rougher map (right), the peak in the map is suppressed by a factor of ~2 in the former
case compared to the latter case (see footnote 6). In contrast, surface densities away from local extrema
are nearly identical in the two panels. This indicates that these maps (or any other maps of
stellar surface density using the various methods listed above) are likely to produce biased
values for maximum surface density, but may be reasonably accurate in regions with smooth
surface-density gradients.


6 The central subcluster surface densities from Paper I were obtained by parametric modeling of the 
unbinned data, so they should not be affected by this bias. 
Fig. 5.— Surface densities in all 17 regions, shown using the identical spatial scale (parsec
scale given by arrow) and intrinsic surface density scale (stars pc$^{-2}$ scale given by color bar)
for each region. (The data behind this figure is provided with the electronic edition of this
article.)
We have run simulations to test the reliability of the Voronoi surface-density technique,
in particular whether stochastic clumping of spatially random points can produce false peaks
in the surface-density maps (see footnote 7). We generate a point process with complete spatial randomness
of 100 points in a unit square and estimate surface density using the Voronoi method with
$N_{\rm obs}/5$ points to construct the tessellation. This procedure is repeated 10,000 times. We
find that the root-mean-square uncertainty in surface density at the location of each point
is 0.1 dex, 1% of points may have surface density values 2 times higher, and 0.01% of points
may have surface density values 3 times higher. Given that the surface densities in star-forming
regions vary over more than 4 orders of magnitude, these stochastic variations are
insignificant, and the observed peaks are likely to be real.



Fig. 6.— Surface density maps for the same region (central part of Trifid) with two different
smoothings: in the left panel 10 points are used per Voronoi cell, while in the right panel 5
points are used per Voronoi cell. The color bar shows observed surface density in units of
stars pc$^{-2}$ (color-bar ticks run from 5 to 45).


3.1. Descriptions of Surface-Density Maps 


The structures seen in the maps in Figure 5 are the projected distributions of stars in
star-forming regions, so physically discrete groups of stars may overlap each other on the
map. This is known to be the case for the Orion region: this field of view includes, in order of
distance along our line of sight, the periphery of the older NGC 1980 cluster (Alves & Bouy
2012), the ONC, a dense subcluster containing massive stars (BN/KL; Becklin & Neugebauer
1967; Kleinmann & Low 1967), and stars embedded in the OMC, including the OMC1-S


7 This issue equally affects the k-th nearest neighbor and adaptive kernel methods.
subcluster (Grosso et al. 2005). The clumpiness seen in the MYStIX maps hints at such
structure in other regions.


The stellar surface density maps in Figure 5 show that most regions have small clumps
of stars with extremely high surface density, but most of the area of the star-forming regions
has surface densities well below the maxima. The highest peaks in surface density include
the core of the RCW 38 cluster (~34,000 stars pc$^{-2}$), Orion (~17,000 stars pc$^{-2}$ for the ONC
and ~22,000 stars pc$^{-2}$ for BN/KL), M 17 (~12,000 stars pc$^{-2}$), the Tr 14 cluster in Carina
(~10,000 stars pc$^{-2}$), and RCW 36 (~10,000 stars pc$^{-2}$). In the M 17 region, a projected area
of 9.8 arcmin$^2$ (= 3.3 pc$^2$) has surface densities greater than 1000 stars pc$^{-2}$, substantially
larger than for any other MYStIX MSFR. At the other extreme, the Rosette Nebula
has an overall low stellar surface density.
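The conversion between angular and physical area quoted for M 17 can be checked with the small-angle approximation. The distance of 2.0 kpc used below is an assumption of this example (it is not stated in this section), chosen because it reproduces the quoted 3.3 pc$^2$; the function name is hypothetical.

```python
import math


def arcmin2_to_pc2(area_arcmin2, distance_pc):
    """Convert a projected area from square arcminutes to square parsecs.

    Uses the small-angle approximation: one arcminute subtends
    distance * (1/60 degree in radians) parsecs.
    """
    pc_per_arcmin = distance_pc * math.radians(1.0 / 60.0)
    return area_arcmin2 * pc_per_arcmin ** 2


# Assumed distance of 2.0 kpc for M 17 (not given in this section).
area_m17_pc2 = arcmin2_to_pc2(9.8, 2000.0)
```

Because the physical area scales with the square of the distance, a 10% distance error changes the area, and hence the surface density, by about 20%.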


The ONC plays a paradigmatic role in our understanding of young stellar clusters (e.g.,
Hillenbrand 1997; Getman et al. 2005b; Bally et al. 2000), and the surface-density maps show
that it is similar in size and density to many of the densest clumps of stars in other MSFRs,
even regions in which a much larger physical area has been surveyed. An example of this
is the NGC 6357 region, which has three dense clusters similar to the ONC. Nevertheless,
not all of the MYStIX MSFRs contain structures comparable to the ONC; neither
the Eagle Nebula nor the Lagoon Nebula is as centrally concentrated as the ONC, despite
having larger total young stellar populations. Older MSFRs like NGC 1893 and NGC 2362
entirely lack dense cluster cores.


The simulation from Section 2.1.1 of the degradation of the Orion Nebula data if this
region were observed with the distance and exposure time for Carina (reducing the number
of X-ray sources from 1216 to 120) reveals how results for more distant MYStIX regions might
be affected by lower sensitivity. The smoothed surface-density map for the simulated observation
has a broadened central core, and, hence, the maximum surface density at the center
of the ONC is reduced. Most of the highly embedded stars around BN/KL were removed,
so this subcluster is not seen. However, the range of surface densities outside these peaks
is not changed significantly. Thus, in other MSFRs, one may expect that some of the small
but dense subclusters may be missing from our surface-density maps.


4. Histograms of Surface Density 

The surface density at the location of a star, $\Sigma_*$, can provide useful information about the
environment that young stars experience in star-forming regions (e.g., Lada & Lada 2003;
Gutermuth et al. 2005; Jørgensen et al. 2008). Dynamical equilibration and violent few-body
interactions, for example, will occur only in dense stellar cores. Low density regions, in
contrast, can produce dynamically fragile wide binary systems (e.g., Feigelson et al. 2006).
For each of the ~17,000 MPCM stars in our sample, $\Sigma_*$ is interpolated from the intrinsic
surface-density maps. For each MPCM in this sample, there are an average of $(1/f_{\rm full} - 1)$
undetected young stars at nearly the same location. Therefore, to estimate the intrinsic
distribution of $\Sigma_*$ for a star-forming region using a histogram, the number of stars in each
bin is calculated by weighting a star at location $(\alpha, \delta)$ by $1/f_{\rm full}(\alpha, \delta)$. Thus, the summation
of values in all bins should equal the inferred total number of stars in a region, $N_{\rm tot}$, rather
than the number of observed stars, $N_{\rm obs}$.
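The weighting scheme can be sketched as a weighted histogram in log surface density. The function name, the histogram range, and the example values are hypothetical; the 0.2 dex bin width matches the one used for the figures.

```python
import math


def weighted_log_histogram(sigma_star, f_full, bin_width=0.2,
                           log_min=0.0, log_max=5.0):
    """Histogram of log10(Sigma_*) with each star weighted by 1/f_full.

    sigma_star : intrinsic surface density at each observed star
    f_full     : detection fraction at each star's position
    Returns the list of bin heights; stars outside the range are skipped.
    """
    n_bins = round((log_max - log_min) / bin_width)
    hist = [0.0] * n_bins
    for s, f in zip(sigma_star, f_full):
        k = math.floor((math.log10(s) - log_min) / bin_width)
        if 0 <= k < n_bins:
            hist[k] += 1.0 / f  # this star stands in for 1/f intrinsic stars
    return hist


# Three hypothetical stars: two in a sparse area, one in a dense, obscured core.
hist = weighted_log_histogram([150.0, 150.0, 5000.0], [0.5, 0.5, 0.25])
n_tot_estimate = sum(hist)
```

The sum of the bin heights recovers the inferred total population only if every star falls inside the histogram range.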

Figure 7 shows these $\Sigma_*$ histograms for each of the 17 MSFRs, with a bin size of
0.2 dex. These graphs show that $\Sigma_*$ ranges over ~4 orders of magnitude from $\Sigma_* \sim 1$
to 30,000 stars pc$^{-2}$. The majority of the young stars in the MYStIX survey lie in regions
with $\Sigma_*$ = 10-10,000 stars pc$^{-2}$.

Comparison of the local surface densities in the neighborhoods around stars to the
average surface density across the entire field of view is useful for quantifying the degree
of local clustering. This is the principle behind the nearest-neighbor test for clustering in
spatial-point patterns by Diggle (1983), Ripley (1988), and Cressie (1991). The thick, gray
lines superimposed on each histogram in Figure 7 show the average surface density over the
entire field of view. For every MYStIX region, the median surface density at the location of
stars is greater than the average surface density across the region by 40-400%, indicating
strong clustering in all cases. In addition, all regions have at least a few stars in subregions
with surface densities 1-2 orders of magnitude above the average for the field of view.
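The comparison described here reduces to a single ratio, sketched below. The example uses an unweighted median, which is an assumption of this sketch; the function name and the numbers are hypothetical.

```python
import statistics


def clustering_excess_percent(sigma_star, n_tot, area_pc2):
    """Percent by which the median Sigma_* exceeds the field-average density.

    sigma_star : surface density at the location of each star (stars per pc^2)
    n_tot      : total number of stars in the field of view
    area_pc2   : area of the field of view in pc^2
    """
    average_density = n_tot / area_pc2
    return 100.0 * (statistics.median(sigma_star) / average_density - 1.0)


# Hypothetical field: 200 stars over 10 pc^2, i.e. 20 stars pc^-2 on average.
excess = clustering_excess_percent([10.0, 100.0, 1000.0], n_tot=200, area_pc2=10.0)
```

A spatially uniform population would give an excess near zero, so large positive values indicate that most stars sit in regions denser than the field average.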

There is much variety in the positions of the histogram peaks (i.e., the mode of the
distribution). Orion peaks at the highest $\Sigma_*$ for the MYStIX sample (1000 stars pc$^{-2}$),
while the peak for Trifid is the lowest (10 stars pc$^{-2}$). The distributions of $\log \Sigma_*$ are often
asymmetric around the peak. Some regions have a narrower distribution of $\log \Sigma_*$, like
Rosette, Lagoon, and Carina, while others have a wider distribution, like Flame, RCW 38,
NGC 6357, and Eagle. A few regions have statistically significant multimodality, including
NGC 2362 and Trifid, where the null hypothesis of a unimodal distribution is rejected by
Hartigan’s dip test (see footnote 8) at p < 0.01. For Trifid, the different modes correspond to different
subclusters from Paper I, which have significantly different mean densities, allowing them
to be distinguished on this diagram (cf. Pfalzner et al. 2012). The multimodal structure
that appears to exist in the histograms of Flame, RCW 36, and RCW 38 is only marginally

8 The hypothesis test for multimodality from Hartigan & Hartigan (1985) is implemented in the diptest
CRAN package of the R statistical software environment (Maechler 2013).



























Fig. 7.— The surface density distributions for each individual star-forming region. The
histogram bin widths are 0.2 dex, and the x axes are the same for each plot. The gray lines
indicate the average surface density in the field of view, defined as the total number of stars
divided by the area of the field of view. Left to right and top to bottom: Orion, Flame,
W 40, RCW 36, NGC 2264, Rosette, Lagoon, NGC 2362, DR 21, RCW 38, NGC 6334,
NGC 6357, Eagle, M 17, Carina, Trifid, and NGC 1893.
significant (p ~ 0.05).

There are limitations to this survey of $\Sigma_*$ due to the small fields of view of the MYStIX project,
given that the nearest Galactic star-forming complexes can cover many square degrees on
the sky. It is important to note that the average surface densities for different regions cannot
be directly compared with each other because they depend on the size of the field of view; for
example, the small field of view for Orion only captures the dense region around the ONC,
while the large field of view for Carina captures a much wider variety of environments. This
field-of-view selection effect will also influence values of $\Sigma_*$, and must be carefully taken into
consideration when comparing surface densities in two different regions. The less active sites
of star formation in large star-forming complexes may often lie outside the MYStIX fields of
view, so the MYStIX survey does not represent star formation in low surface-density regions
particularly well. The simulated reduced-sensitivity Orion data (Section 2.1.1) also allow
us to examine the effect of lower source-detection rates for more distant regions with shorter
Chandra observations. The histogram produced from the 120 sources in this sample (not
shown) spans a $\Sigma_*$ range from 100 to 6000 stars pc$^{-2}$, with a peak just under 1000 stars pc$^{-2}$.
Although the maximum $\Sigma_*$ is reduced, the mode of the distribution is nearly identical to
that of the original data.
4.1. Comparison of Surface Density Distributions in MYStIX and B10


Care must be taken when comparing the results here with the results from B10 due to
different selection effects on the young stellar samples used for each study. The sky coverage
for both MYStIX and B10 is the result of multiple projects with different objectives, so
neither represents an unbiased survey of star-forming environments in the Galaxy. Perhaps
most important, the B10 survey is based only on infrared-excess stars, and thus misses the
large X-ray selected disk-free population. In MPCM samples, the stars with no detected IR
excess typically outnumber the disk-bearing subpopulation by 2-3:1, and sometimes by more
than 7:1 (Broos et al. 2013, their Table 1). It is therefore not surprising that stellar surface
densities derived from the MYStIX survey are higher than those derived by B10.

One of the main observations of B10 is that the $\Sigma_*$ distribution of their sample has
no discrete modes (peaks in the logarithmically binned histogram) corresponding to “distributed”
star formation or “clustered” star formation. As a result, they concluded that the
histogram of surface densities alone is not sufficient for determining which stars are clustered
and which stars are not. This assertion is supported by Gieles et al. (2012), who demonstrate
that even if all stars are born in clusters, cluster expansion can yield a variety of different
surface density distributions, including the log-normal distribution of B10. Pfalzner et al.
(2012) additionally demonstrate that for King (1962) cluster density profiles, the low density
portion of the cluster outside the dense core can make it difficult to identify distinct peaks
in the $\Sigma_*$ distribution, even if they do exist. Our results show that different MSFRs have
radically different $\Sigma_*$ distributions (Figure 7), rather than following a universal distribution.
It is thus difficult or impossible to identify any dividing surface density threshold from the
empirical $\Sigma_*$ data. B10 also see shifts in the peaks of the distributions for the different
surveys they use, ~7 stars pc$^{-2}$ for the Gould Belt+Taurus and ~30 stars pc$^{-2}$ for the c2d
regions.

Another conclusion of B10 is that the unimodal log-normal distribution (peaked at
22 stars pc$^{-2}$ with standard deviation width of 0.85 dex) means that most stars form outside
clusters, while densities like 100, 1000, 10,000 stars pc$^{-2}$ are in the tail of the distribution (see footnote 9).
They choose a surface-density threshold of 200 stars pc$^{-2}$ to be their definition of a cluster
because this is where stars become likely to interact with their neighbors. Gieles et al.
(2012) and Pfalzner et al. (2012) caution about using projected surface densities to estimate
the fraction of stars undergoing local dynamical interactions because analysis of the empirical
surface-density histograms for a sample does not take into account cluster evolution,
radial cluster structure, and superposition of discrete clusters. Thus, they argue that stellar
interaction could be important even if the empirical histograms show most stars lie in
environments below an astrophysically selected threshold like 200 stars pc$^{-2}$.

9 Nevertheless, B10 note that they are not sensitive to the extreme high $\Sigma_*$ “tail” for the ONC.

Both MYStIX and B10 investigate star formation in the Orion Giant Molecular Clouds,
including the Orion A and B molecular clouds, containing the Orion Nebula and Flame Nebula,
respectively, making this a useful region to directly compare results. The combination of
the B10 and the MYStIX surface-density distributions for this star-forming region provides
a less biased estimate of the full surface-density distribution. We have emulated the B10
analysis for the larger-scale Orion star-forming region in the areas outside the MYStIX fields
of view, using the Megeath et al. (2012) catalog of YSOs, with a magnitude limit at IRAC
[3.6] = 14 mag. The objects excluded due to overlap with the MYStIX fields of view include
19% of the B10 Orion sample. The distribution of these sources on the sky is shown in the
left panel of Figure 8, with polygons cut out representing the Flame Nebula MYStIX field of
view in the north and the Orion Nebula MYStIX field of view in the south. To compute the
$\Sigma_*$ histogram for the Orion A and B region (excluding the MYStIX fields for the Orion and
Flame nebulae), we follow B10 by assuming a uniform disk fraction of 65%, so all densities
and bin heights are increased by a factor of 1.35.

The right panel of Figure 8 shows these two histograms, labeled “B10 method” for
the Orion A and B molecular clouds (excluding the MYStIX Orion and Flame Nebulae
fields of view) and labeled “MYStIX” for the coadded Orion Nebula and Flame Nebula
histograms. Both methods attempt to account for intrinsic numbers of stars: in B10’s case,
by accounting for missing Class III stars with a uniform correction; in our case, through the
XLF and IMF methods described in Section 2. At 5 Myr and a distance of 0.414 kpc, a
PMS star of 0.1 $M_\odot$ would have an L-band (approximately the IRAC [3.6] band) photospheric magnitude
of ~13 mag (Siess et al. 2000), so B10’s sample should be sensitive to YSOs down to the
hydrogen-burning mass limit. Although the “B10 method” histogram is missing 19% of the
B10 stars due to overlap with the MYStIX regions, the location of its peak is consistent with
the histogram in Figure 1 of B10.

The full distribution for the Orion Giant Molecular Clouds in Figure 8 is clearly very
different from the results in Figure 1 of B10. Furthermore, based on the total numbers of
stars in these histograms, clustered stars make up one of the dominant components of the $\Sigma_*$
distribution, rather than just being an extreme value “tail” of the distribution. Underestimation
of the number of stars in high density regions was also commented on by Pfalzner et al.
(2012). The coadded B10 and MYStIX histogram (the dashed line) has some bimodal
structure, with peaks around 100 and 1000 stars pc$^{-2}$ and a shoulder around 20 stars pc$^{-2}$,
but this may be an artefact of selection effects in both studies.


Figure 9 shows $\Sigma_*$ histograms for just the Flame Nebula region: the histogram obtained
using the B10 method with only IR-excess selected members from Megeath et al. (2012) and
the histogram obtained from MYStIX using both IR-excess and X-ray selected members.
Visual inspection of the Spitzer IRAC images reveals that IR sources are likely lost in the
wings of bright sources and in regions with bright nebulosity. As a result, the IR-excess-only
method underestimates both the number of stars in this cluster and the stellar surface
densities.

Limitations that affect the IR-excess census of YSOs (outside the dense Flame Nebula
and Orion Nebulae) include variations in Spitzer sensitivity due to the presence of nebulosity
(Kuhn et al. 2010; Povich et al. 2011; [...] 2007) and variations in the disk fraction (e.g.,
Getman et al. 2009; Povich et al. 2011; Getman et al. 2012, 2014a). If a significant number of
young stars were missed in the outer portions of the star-forming complex, it might increase
the low-density component of the combined histogram to the point where it is comparable
to the high-density component. Results from a large XMM survey of the Orion A molecular
cloud (e.g., Pillitteri et al. 2013) will be able to address some of these limitations. Limitations
to the MYStIX censuses include reduced sensitivity of X-ray selection to protostars
(Prisinzano et al. 2008) and to highly absorbed clumps of stars, and assumptions about a
uniform XLF and IMF.

While improved information about the typical environments in which stars form would
be useful for theoretical models of star-formation and of cluster formation and evolution, it is
difficult to generalize results like the ones here or from B10 to star-formation in the Galaxy
as a whole. Even within the MYStIX sample, surface density distributions from one region
provide little information about other regions (Figure [Tj). We have demonstrated that
B10’s results are likely not valid for the Orion molecular clouds (~70% of their sources).
But their results are likely to be reasonable for other nearby star-forming clouds that are
less affected by crowding and nebulosity, although scalings to include disk-free subpopulations
may vary. Nevertheless, the high surface densities seen in other MSFRs indicate that
the results for the Orion molecular clouds are not anomalous, and may indeed be typical
for most stars, in the context of Galactic star-formation. The initial cluster mass function,
described by a power law with an index of approximately -2 over a large cluster mass range
(Lada & Lada 2003; Chandar et al. 2010; Portegies Zwart et al. 2010; Lada 2010; Ryon et al.
2014; Fouesneau et al. 2014), would imply that stars are roughly equally likely to form in
complexes of different masses. This would mean that star formation is evenly spread between
small Taurus/Chamaeleon-type clouds with tens-to-hundreds of stars, smaller OB star-forming
regions like NGC 2264 containing hundreds to thousands of stars, giant molecular
clouds like the Orion complex containing thousands to 10^4 stars, and major star-forming
complexes like Carina containing 10^4-10^5 stars. Since massive stars appear in the second
group, most stars are likely born in regions containing or influenced by massive stars.
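
The statement that stars are roughly equally likely to form in complexes of different masses follows from integrating the mass function: if dN/dM ∝ M^-2, the stellar content of clusters with masses between M1 and M2 is proportional to ln(M2/M1), so every decade of cluster mass contributes the same amount. The short Python sketch below checks this arithmetic. The function name, the arbitrary normalization, and the use of star counts as a proxy for cluster mass are assumptions made here for illustration only.

```python
import math

def stellar_content_in_range(m_lo, m_hi, index=-2.0):
    """Integral of M * M**index dM from m_lo to m_hi (arbitrary normalization).

    For index = -2 the integrand is 1/M, so the result is ln(m_hi / m_lo).
    """
    if index == -2.0:
        return math.log(m_hi / m_lo)
    p = index + 2.0
    return (m_hi ** p - m_lo ** p) / p

# Four decades in cluster size (number of stars used as a proxy for mass),
# loosely matching the four groups listed in the text.
decades = [(1e1, 1e2), (1e2, 1e3), (1e3, 1e4), (1e4, 1e5)]

content = [stellar_content_in_range(lo, hi) for lo, hi in decades]
total = sum(content)
fractions = [c / total for c in content]

for (lo, hi), f in zip(decades, fractions):
    print(f"{lo:8.0f} - {hi:8.0f} stars: {100 * f:.1f}% of all stars")
```

With an index steeper or shallower than -2 the same function shows the balance shifting toward smaller or larger complexes, respectively.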

Fig. 8.— Left: Points mark the positions of YSOs from Megeath et al. (2012) in the Orion
A and B molecular clouds that meet the B10 selection criteria and lie outside the Orion
Nebula and Flame Nebula MYStIX fields (indicated by polygons). Right: The Σ* histograms
for B10 objects and MYStIX objects are shown by blue and red lines, respectively. The
coadded B10+MYStIX histogram is shown with a dashed line.

Fig. 9.— For the 17' x 17' MYStIX field of view for the Flame Nebula, Σ* distributions
are shown which are inferred using only IR-excess selected members (B10 method; blue
histogram) and IR-excess plus X-ray selected members (MYStIX; red histogram).

5. Models of Subclustered and Unclustered Stars


In Paper I, we used “finite mixture models” to subdivide the observed populations of
stars in the MYStIX MSFRs into statistical clusters and “unclustered” uniformly distributed
components. This is a common method of cluster analysis which provides “soft” cluster
assignments for individual stars, that is, probabilities that a star belongs to a particular
subpopulation. Our models, which estimate the spatial surface density in star-forming regions,
are the composite of multiple parametric models for each of the clustered and unclustered
subpopulations. The subclusters were modeled using ellipsoidal surface density profiles, similar to
the model forms investigated by Hillenbrand & Hartmann (1998) and Pfalzner et al. (2012),
and an additional spatially-constant component was added to account for stars that are not
part of any subcluster. The number of subclusters, the subcluster parameters, and the number
of stars in each component were determined by model fitting with model selection based
on the widely-used Akaike Information Criterion penalized likelihood measure. This method
requires a parametric form to be assumed for the subcluster models, and assumes that stars
which are not part of subclusters are uniformly distributed, which are not necessarily accurate
assumptions; however, the best-fit results in Paper I show that these models are able to
reproduce the observed surface density distributions with low-amplitude residual maps.
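
The two ingredients named above, soft membership probabilities and model selection by the Akaike Information Criterion, can be illustrated with a short Python sketch. The circular cored profile, the parameter values, and all function names are hypothetical stand-ins chosen for brevity; they are not the ellipsoidal models or the fitting code of Paper I.

```python
def cored_profile(x, y, x0, y0, r_core):
    """Illustrative circular surface-density profile, normalized to 1 at the center."""
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return 1.0 / (1.0 + r2 / r_core ** 2)

def membership_probabilities(x, y, subclusters, uniform_density):
    """Soft assignment of a star at (x, y).

    Each component's probability is its share of the total model surface
    density at that position. `subclusters` is a list of
    (x0, y0, r_core, central_density); the last returned entry is the
    probability of belonging to the uniform ("unclustered") component.
    """
    densities = [s0 * cored_profile(x, y, x0, y0, rc)
                 for (x0, y0, rc, s0) in subclusters]
    densities.append(uniform_density)
    total = sum(densities)
    return [d / total for d in densities]

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln L (lower is preferred)."""
    return 2.0 * n_params - 2.0 * log_likelihood

# Hypothetical model: one subcluster (core radius 0.1 pc, 1000 stars pc^-2 at
# its center) on top of a uniform component of 10 stars pc^-2.
model = [(0.0, 0.0, 0.1, 1000.0)]
p_center = membership_probabilities(0.0, 0.0, model, 10.0)
p_far = membership_probabilities(2.0, 0.0, model, 10.0)
print("at the subcluster center:", p_center)
print("2 pc from the center:    ", p_far)
```

A star at the center of the subcluster is assigned to it with probability close to 1, while a star far from the center is assigned mostly to the uniform component. Adding a subcluster to a model is justified under this criterion only if the gain in log-likelihood outweighs the penalty of 2 per extra parameter.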


Table [3] gives the intrinsic populations for the finite-mixture-model “unclustered” component.
Corrections for sample incompleteness of this component were obtained using the
same methods described in Section 2.1.2 to obtain intrinsic populations for subclusters in
Table [2]. The number of unclustered stars from our analysis differs from the results of
Feigelson et al. (2011) in Carina and Feigelson et al. (2009) in NGC 6334 because those
studies use a threshold method, while we use the mixture model method. Table [3] also
reports the fraction of stars belonging to the unclustered component.


In the models, 80-90% of the young stellar populations in the MYStIX fields of view
are members of subclusters, while 10-20% of the young stellar populations are part of a
component that is approximately uniformly distributed. The stars that are part of the latter
component could be made up of stars that formed in relatively isolated parts of a MSFR,
an earlier generation of star-formation that has been dispersed, stars that drifted away from
subclusters, or clumps of stars with too few members to be identified as subclusters. The
complexes with larger fractions of unclustered stars include younger regions with more
active star formation, for example W 40 (20% unclustered), NGC 2264 (18% unclustered),
and the Rosette Molecular Cloud (22% unclustered). This may be the result of young stars
forming in networks of molecular filaments outside the main cluster (e.g., André et al. 2010),
which could later become incorporated into the cluster in a hierarchical merger scenario
(Maschberger et al. 2010). In contrast, for DR 21 (13% unclustered) the youngest objects
tend to be embedded in the dense filament passing through the center of the cluster, while
the unclustered objects have an older median age (Getman et al. 2014b). The consistently
low fraction of stars in the unclustered components of the MYStIX regions highlights that
the distribution of stars is very clumpy, rather than spatially smooth.

10 There are multiple different definitions for “young stellar cluster” in the literature, and here we use
cluster in the statistical sense.

The fraction of stars in subclusters, combined with information about subcluster properties
(whether or not the MYStIX subclusters are gravitationally bound is investigated in
Paper III), will help characterize the future evolution of the MYStIX MSFRs. The fraction
of stars that are born into groups that are initially gravitationally bound has been related
to the fraction of star formation that results in bound open clusters (Kruijssen 2012).

Table 3. Clustered and Unclustered Young Stellar Populations

Region      N_uncl    N_clust   % clustered
            (stars)   (stars)   (%)
(1)         (2)       (3)       (4)

Orion       170       2400      93
Flame       ...       800       ...
W 40        100       420       80
RCW 36      35        510       93
NGC 2264    340       1600      82
Rosette     560       1900      78
Lagoon      ...       3800      ...
NGC 2362    26        570       96
DR 21       370       2500      87
RCW 38      ...       9900      ...
NGC 6334    760       8600      92
NGC 6357    ...       12000     ...
Eagle       ...       8100      ...
M 17        ...       16000     ...
Carina      2700      31000     91
Trifid      360       2700      88
NGC 1893    67        4500      98

Note. — Properties of total intrinsic subclustered and unclustered young stellar populations.
Column 1: Region name. Column 2: Intrinsic number of young stars in the unclustered
component (integrated over the whole field of view). Column 3: Sum of intrinsic subcluster
populations for a region (integrated over the whole field of view). Column 4: Fraction of
total young stellar population in subclusters.
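
Column 4 of Table 3 appears to follow directly from Columns 2 and 3 as N_clust / (N_uncl + N_clust). The Python sketch below recomputes it for a few regions that have both entries. The dictionary and function names are illustrative, and because the tabulated counts are rounded, a recomputed percentage can differ from the published column by about one point (W 40 is an example).

```python
# (N_uncl, N_clust) copied from Table 3 for regions with both entries.
table3 = {
    "Orion": (170, 2400),
    "W 40": (100, 420),
    "NGC 2264": (340, 1600),
    "DR 21": (370, 2500),
}

def percent_clustered(n_uncl, n_clust):
    """Percentage of the intrinsic population assigned to subclusters."""
    return 100.0 * n_clust / (n_uncl + n_clust)

for region, (n_uncl, n_clust) in table3.items():
    pct = percent_clustered(n_uncl, n_clust)
    print(f"{region:9s} {pct:5.1f}% clustered, {100.0 - pct:5.1f}% unclustered")
```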

6. Summary

We present total intrinsic populations and their surface-density distributions for 17
massive star-forming regions using catalogs of young stars from the MYStIX project.

• The analysis of X-ray luminosities of stars in 17 star-forming regions shows results
that are consistent with the hypothesis of a “universal XLF” for PMS stars. Figure [H
displays the cumulative distributions of 0.5-8.0 keV, absorption-corrected X-ray luminosities
for the ~17,000 MPCM stars in our sample. The ONC XLF from the COUP
study is still the only XLF in our study complete down to 0.1-0.2 M☉. However, the
other regions show close agreement with the ONC over the range of X-ray luminosities
for which their intrinsic XLF is observed.

• The COUP XLF and standard near-IR Initial Mass Functions are used to extrapolate
total intrinsic numbers of stars in the MYStIX fields of view, which are given in Table [1]
(Column 3). The total intrinsic numbers of stars in individual subclusters from Paper I
are provided in Table [2]. About 16% of the full population appears in the MYStIX
samples, but detection fractions vary from region to region.

• The total numbers of stars calculated from XLF extrapolation and from IMF
extrapolation are consistent (Figure [4]). There is little systematic offset in
total populations calculated by these two methods, but the root-mean-square scatter
in the relation is 0.25 dex. This result validates the use of the XLF to infer intrinsic
total populations of stars.

• Intrinsic stellar surface-density maps are provided for 17 star-forming regions, plotted
using the same physical scale and surface density scale (Figure [5]). This set of maps
provides one of the best and least biased views available today of the spatial distributions
of stars formed in giant molecular clouds over the past few million years. Stars
are highly clustered within these fields, with subclusters of stars similar in size and
density to the ONC existing in several of the complexes. These surface density maps
are available in FITS format in the online edition of this paper.

• The highest surface densities in the MYStIX regions are found in the core of RCW 38
(>30,000 stars pc⁻²). Another notable high-surface-density star-forming region is
M 17, which has an unusually large area with >1000 stars pc⁻². The other regions
containing stars in similar or greater numbers to M 17 (NGC 6357 and Carina) have
much smaller high-surface-density regions.

• MYStIX finds surface densities in massive star-forming regions ranging from ~1 to
~30,000 stars pc⁻², exceeding the surface density of the ONC Trapezium core by a
factor of 2. Peaks in the logarithmically binned histograms of surface density vary
from region to region, with some peaked near 22 stars pc⁻², similar to B10’s results
(e.g., Rosette), but with others having surface-density modes between 200 stars pc⁻²
(e.g., NGC 1893) and 2,000 stars pc⁻² (e.g., M 17). The variation in the shape of these
distributions indicates that there is no universal surface-density distribution applicable
to all types of star-forming regions and no special value of surface density characteristic
of Galactic star formation.

• In the MYStIX regions, more than half of the young stars lie in regions with high surface
densities of 100-10,000 stars pc⁻². Given that most stars form in star-forming regions
containing O-type stars, this result suggests that a large fraction of stars in the Galaxy
formed in such dense environments. Nevertheless, quantifying this contribution would
require an unbiased survey of many different
star-forming environments. We also demonstrate that for the combined young stellar
population of the Orion molecular clouds, including ~70% of the stars used in B10’s
study, the peak of the logarithmically binned distribution is >1000 stars pc⁻², not
~20 stars pc⁻² as they find for their subsample (Figure [8]).

• Within the MYStIX fields of view, 80-90% of stars belong to subclusters identified in
Paper I and 10-20% of stars belong to the uniformly distributed component. These
values, combined with information about gravitational boundedness of subclusters,
have implications for cluster survival models.

In theoretical studies of star formation, it is important not to underestimate the number
of stars born in high-density regions where interactions between stars and cluster
dynamics can have important effects on binary distributions, few-body stellar interactions,
and cluster survival. We have demonstrated that X-ray selection of young stars
can be very helpful in this regard, detecting several times more stars than samples
based on IR-excess disks and allowing more accurate estimation of the populations of
dense clusters.